AITopics | unbiased learning

ALarge Scale Search Dataset for Unbiased Learning to Rank

Neural Information Processing SystemsApr-24-2026, 10:11:48 GMT

The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debias algorithms. However, promising results on the existing benchmark datasets may not be extended to the practical scenario due to some limitations of existing datasets. First, their semantic feature extractions are outdated while state-of-the-art large-scale pre-trained language models like BERT cannot be utilized due to the lack of original text. Second, display features are incomplete; thus in-depth study on ULTR is impossible such as the displayed abstract for analyzing the click necessary bias. Third, synthetic user feedback has been adopted by most existing datasets and real-world user feedback is greatly missing. To overcome these disadvantages, we introduce the Baidu-ULTR dataset. It involves randomly sampled 1.2 billion searching sessions and 7,008 expert annotated queries (397,572 query document pairs).

artificial intelligence, machine learning, query, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

07f560092a0edceabf55af32a40eaee3-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsFeb-7-2026, 08:15:26 GMT

First,theirsemantic feature extractions are outdated while state-of-the-art large-scale pre-trained language models like BERT cannot be utilized due to the lack of original text.

artificial intelligence, dataset, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.05)
Europe > Spain > Galicia > Madrid (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Unbiased learning of deep generative models with structured discrete representations

Neural Information Processing SystemsDec-26-2025, 23:09:51 GMT

By composing graphical models with deep learning architectures, we learn generative models with the strengths of both frameworks. The structured variational autoencoder (SVAE) inherits structure and interpretability from graphical models, and flexible likelihoods for high-dimensional data from deep learning, but poses substantial optimization challenges. We propose novel algorithms for learning SVAEs, and are the first to demonstrate the SVAE's ability to handle multimodal uncertainty when data is missing by incorporating discrete latent variables. Our memory-efficient implicit differentiation scheme makes the SVAE tractable to learn via gradient descent, while demonstrating robustness to incomplete optimization. To more rapidly learn accurate graphical model parameters, we derive a method for computing natural gradients without manual derivations, which avoids biases found in prior work. These optimization innovations enable the first comparisons of the SVAE to state-of-the-art time series models, where the SVAE performs competitively while learning interpretable and structured discrete data representations.

deep generative model, name change, unbiased learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

Add feedback

Removing Hidden Confounding in Recommendation: A Unified Multi-Task Learning Approach

Neural Information Processing SystemsDec-26-2025, 13:02:59 GMT

In recommender systems, the collected data used for training is always subject to selection bias, which poses a great challenge for unbiased learning. Previous studies proposed various debiasing methods based on observed user and item features, but ignored the effect of hidden confounding. To address this problem, recent works suggest the use of sensitivity analysis for worst-case control of the unknown true propensity, but only valid when the true propensity is near to the nominal propensity within a finite bound. In this paper, we first perform theoretical analysis to reveal the possible failure of previous approaches, including propensity-based, multi-task learning, and bi-level optimization methods, in achieving unbiased learning when hidden confounding is present. Then, we propose a unified multi-task learning approach to remove hidden confounding, which uses a few unbiased ratings to calibrate the learned nominal propensities and nominal error imputations from biased data. We conduct extensive experiments on three publicly available benchmark datasets containing a fully exposed large-scale industrial dataset, validating the effectiveness of the proposed methods in removing hidden confounding.

recommendation, removing hidden confounding, unified multi-task learning approach, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank

Neural Information Processing SystemsDec-25-2025, 10:32:38 GMT

Using Unbiased Learning to Rank (ULTR) to train the ranking model with biased click logs has attracted increased research interest. The key idea is to explicitly model the user's observation behavior when building the ranker with a large number of click logs. Considering the simplicity, recent efforts are mainly based on the position bias hypothesis, in which the observation only depends on the position. However, this hypothesis does not hold in many scenarios due to the neglect of the distinct characteristics of individuals in the same position. On the other hand, directly modeling observation bias for each individual is quite challenging, since the effects of each individual's features on relevance and observation are entangled. It is difficult to ravel out this coupled effect and thus obtain a correct relevance model from click data.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Add feedback

A Large Scale Search Dataset for Unbiased Learning to Rank

Neural Information Processing SystemsDec-23-2025, 17:36:41 GMT

The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debias algorithms. However, promising results on the existing benchmark datasets may not be extended to the practical scenario due to some limitations of existing datasets. First, their semantic feature extractions are outdated while state-of-the-art large-scale pre-trained language models like BERT cannot be utilized due to the lack of original text. Second, display features are incomplete; thus in-depth study on ULTR is impossible such as the displayed abstract for analyzing the click necessary bias. Third, synthetic user feedback has been adopted by most existing datasets and real-world user feedback is greatly missing. To overcome these disadvantages, we introduce the Baidu-ULTR dataset. It involves randomly sampled 1.2 billion searching sessions and 7,008 expert annotated queries(397,572 query document pairs).

dataset, scale search dataset, unbiased learning, (9 more...)

Neural Information Processing Systems

Country: Asia > Myanmar > Tanintharyi Region > Dawei (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)

Add feedback

LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank Mouxiang Chen

Neural Information Processing SystemsAug-19-2025, 07:43:18 GMT

Using Unbiased Learning to Rank (UL TR) to train the ranking model with biased click logs has attracted increased research interest. The key idea is to explicitly model the user's observation behavior when building the ranker with a large number of click logs. Considering the simplicity, recent efforts are mainly based on the position bias hypothesis, in which the observation only depends on the position. However, this hypothesis does not hold in many scenarios due to the neglect of the distinct characteristics of individuals in the same position. On the other hand, directly modeling observation bias for each individual is quite challenging, since the effects of each individual's features on relevance and observation are entangled. It is difficult to ravel out this coupled effect and thus obtain a correct relevance model from click data.

artificial intelligence, machine learning, proceedings, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Unidentified and Confounded? Understanding Two-Tower Models for Unbiased Learning to Rank

Hager, Philipp, Zoeter, Onno, de Rijke, Maarten

arXiv.org Artificial IntelligenceJun-26-2025

Additive two-tower models are popular learning-to-rank methods for handling biased user feedback in industry settings. Recent studies, however, report a concerning phenomenon: training two-tower models on clicks collected by well-performing production systems leads to decreased ranking performance. This paper investigates two recent explanations for this observation: confounding effects from logging policies and model identifiability issues. We theoretically analyze the identifiability conditions of two-tower models, showing that either document swaps across positions or overlapping feature distributions are required to recover model parameters from clicks. We also investigate the effect of logging policies on two-tower models, finding that they introduce no bias when models perfectly capture user behavior. However, logging policies can amplify biases when models imperfectly capture user behavior, particularly when prediction errors correlate with document placement across positions. We propose a sample weighting technique to mitigate these effects and provide actionable insights for researchers and practitioners using two-tower models.

artificial intelligence, machine learning, two-tower model, (16 more...)

arXiv.org Artificial Intelligence

2506.20501

Country: Europe > Netherlands (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Unbiased learning of deep generative models with structured discrete representations

Neural Information Processing SystemsMay-27-2025, 12:44:23 GMT

By composing graphical models with deep learning architectures, we learn generative models with the strengths of both frameworks. The structured variational autoencoder (SVAE) inherits structure and interpretability from graphical models, and flexible likelihoods for high-dimensional data from deep learning, but poses substantial optimization challenges. We propose novel algorithms for learning SVAEs, and are the first to demonstrate the SVAE's ability to handle multimodal uncertainty when data is missing by incorporating discrete latent variables. Our memory-efficient implicit differentiation scheme makes the SVAE tractable to learn via gradient descent, while demonstrating robustness to incomplete optimization. To more rapidly learn accurate graphical model parameters, we derive a method for computing natural gradients without manual derivations, which avoids biases found in prior work.

artificial intelligence, generative model, machine learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Add feedback

Unbiased learning of deep generative models with structured discrete representations

Neural Information Processing SystemsJan-20-2025, 00:26:38 GMT

By composing graphical models with deep learning architectures, we learn generative models with the strengths of both frameworks. The structured variational autoencoder (SVAE) inherits structure and interpretability from graphical models, and flexible likelihoods for high-dimensional data from deep learning, but poses substantial optimization challenges. We propose novel algorithms for learning SVAEs, and are the first to demonstrate the SVAE's ability to handle multimodal uncertainty when data is missing by incorporating discrete latent variables. Our memory-efficient implicit differentiation scheme makes the SVAE tractable to learn via gradient descent, while demonstrating robustness to incomplete optimization. To more rapidly learn accurate graphical model parameters, we derive a method for computing natural gradients without manual derivations, which avoids biases found in prior work.

deep generative model, discrete representation, generative model, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Add feedback

Filters

Collaborating Authors

unbiased learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

ALarge Scale Search Dataset for Unbiased Learning to Rank

07f560092a0edceabf55af32a40eaee3-Paper-Datasets_and_Benchmarks.pdf

Unbiased learning of deep generative models with structured discrete representations

Removing Hidden Confounding in Recommendation: A Unified Multi-Task Learning Approach

LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank

A Large Scale Search Dataset for Unbiased Learning to Rank

LBD: Decouple Relevance and Observation for Individual-Level Unbiased Learning to Rank Mouxiang Chen

Unidentified and Confounded? Understanding Two-Tower Models for Unbiased Learning to Rank

Unbiased learning of deep generative models with structured discrete representations

Unbiased learning of deep generative models with structured discrete representations